Handling Irregularities in ROADRUNNER

Authors

  • Valter Crescenzi
  • Giansalvatore Mecca
  • Paolo Merialdo
Abstract

We report on recent advances in the development of the ROADRUNNER system, which is able to automatically infer a wrapper for HTML pages. One of the major drawbacks of the ROADRUNNER approach was its limited ability to handle irregularities in the source pages. To overcome this issue, we have developed a technique for dealing with chunks of unstructured HTML code. Several experiments conducted to evaluate the effectiveness of the approach produced encouraging results.

Similar resources

Truncated Differential Analysis of Round-Reduced RoadRunneR Block Cipher

RoadRunneR is a small and fast bitslice lightweight block cipher for low-cost 8-bit processors, proposed by Adnan Baysal and Sühap Şahin at the LightSec 2015 conference. While most software-efficient lightweight block ciphers lack a security proof, RoadRunneR’s security is provable against differential and linear attacks. RoadRunneR is a Feistel structure block cipher with 64-bit block size. ...

Automatic Web Information Extraction in the ROADRUNNER System

This paper presents roadRunner, a research project that aims at developing solutions for automatically extracting data from large HTML data sources. Our research targets data-intensive Web sites, i.e., HTML-based sites with a fairly complex structure that publish large amounts of data. The paper describes the top-level software architecture of the roadRunner System, and the novel res...

The RoadRunner Web Data Extraction System

Extracting data from HTML text files and making them available to computer applications is becoming of utmost importance for developing several emerging e-services. This paper presents RoadRunner, a research project that aims at developing solutions for automatically extracting data from large HTML data sources. We concentrate on data-intensive Web sites, that is, sites that deliver large amoun...

Roadrunner: Infrastructure-less Vehicular Congestion Control

RoadRunner is an in-vehicle app for traffic congestion control without costly roadside infrastructure, instead judiciously harnessing vehicle-to-vehicle communications, cellular connectivity, and onboard computation and sensing to enable large-scale traffic congestion control at higher penetration and finer granularity than previously possible. RoadRunner limits the number of vehicles in a cong...

First experience of compressible gas dynamics simulation on the Los Alamos roadrunner machine

We report initial experience with gas dynamics simulation on the Los Alamos Roadrunner machine. In this initial work, we have restricted our attention to flows in which the flow Mach number is less than 2. This permits us to use a simplified version of the PPM gas dynamics algorithm that has been described in detail by Woodward (2006). We follow a multifluid volume fraction using the PPB moment...

Publication date: 2004